Abstract: Selective cyclooxygenase inhibitors have attracted much attention in recent times
in the design of new non-steroidal anti-inflammatory drugs (NSAIDs). 3D-QSAR studies
have been performed on a series of 1,5-diarylpyrazoles that act as selective cyclooxygenase-2
(COX-2) inhibitors, using three different methods: comparative molecular field analysis
(CoMFA) with partial least squares (PLS) fit; molecular field analysis (MFA); and receptor
surface analysis (RSA) with the genetic function approximation (GFA). The analyses were carried
out on 30 analogues, of which 25 were used in the training set and the remaining five in the
test set. These studies produced reasonably good predictive models with high cross-validated
and conventional r² values in all three cases.

Keywords: NSAID design, selective COX-2 inhibitors, 3D-QSAR, CoMFA, MFA, RSA. 



Introduction 

Cyclooxygenase (COX) converts arachidonic acid to prostaglandin H2 (PGH2) and subsequently to a
number of other prostaglandins, which are potent mediators of inflammation. COX exists in two
isoforms, COX-1 and COX-2 [1]. COX-1 is constitutively expressed in tissues and is responsible
for the physiological production of prostaglandins, whereas COX-2, the induced isoform, is
responsible for the elevated production of prostaglandins during inflammation [2]. The selective
inhibition of COX-2 is therefore very important in the design of NSAID molecules. Inhibition of
COX-1 produces undesirable gastrointestinal side effects, so selectivity is a highly desirable
attribute in a potential NSAID [3]. Our main objective is thus to design specific inhibitors of
COX-2 in the hope that these molecules may be further explored as powerful non-ulcerogenic
anti-inflammatory agents. Though both structure- and analogue-based drug design methods have been
used in NSAID design in the past, only analogue-based methods have been used in the present study.

Computational Methods 

Molecular 3D Structure Building 

With the satisfactory understanding of the model of drug action of many NSAIDs, there has
been an increased impetus in the synthesis of many COX-2 specific diarylcyclopentane and related
heterocyclic systems. From one such report, in which a number of diarylpyrazoles were synthesized
and their biological activity evaluated for both COX-1 and COX-2, 30 molecules were randomly
selected for the present study (Scheme 1, Tables 1 and 2) [4]. All inhibitors were modelled with
Sybyl. Initial geometric optimizations were carried out using the standard Tripos force field,
with a 0.001 kcal/mol energy gradient convergence criterion and a distance-dependent dielectric
constant, employing Gasteiger charges [5]. Further geometric optimizations were performed using
MOPAC with the AM1 Hamiltonian, and the derived MOPAC charges were used for the subsequent
analysis [6,7]. The final geometry of the molecular skeleton is very similar to that of SC-558
and Celecoxib [2,8].




Scheme 1. 

Alignment

Fragment 1 is common to all the molecules considered in this study, and the molecules
were aligned with respect to this fragment using the simple alignment method in Sybyl (Figure 1).
Alignment by other methods, such as field fit and pharmacophore fit, was also carried out. However,
these alternative alignments did not change the results significantly.




Fragment 1 

Figure 1. Fragment 1 and alignment of all molecules using the Sybyl database alignment procedure. 

CoMFA 

CoMFA fields were generated using the standard Tripos field and 3D-QSAR analysis was performed
by the PLS method [9]. For each cross-validated CoMFA analysis, the minimum σ value was
set to 2 to expedite calculations. For the non-cross-validated CoMFA analyses, the minimum σ value
was set to 0. The steric and electrostatic field energies were calculated using an sp3 carbon probe
atom with a +1 charge. The CoMFA grid spacing was 2.0 Å in all three dimensions within the defined
region, which extended beyond the van der Waals envelopes of all molecules by at least 4.0 Å. The
optimal number of components in the final PLS model was determined by minimizing the standard error
between the predicted and actual activities obtained from the leave-one-out cross-validation
technique. It is essential to assess the predictive power of the CoMFA model by using a test set of
compounds. Therefore, among the 30 inhibitors initially considered, 25 were randomly selected for
the training set and the remaining five were used as the test set. The molecular systems were
rotated in the initially defined field box to test for orientation effects; no such effect was
observed. Similarly, q²-GRS studies employing a grid spacing of 1.0 Å and cut-offs of 0.1 and 0.2
did not improve the r²cv to any significant extent. Figures 2a and 2b show the CoMFA contour maps
of the steric and electrostatic contributions.
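
The grid construction and minimum σ column filtering described above can be illustrated with a short sketch. It is not Sybyl's implementation: the function names `build_grid` and `filter_columns` are hypothetical, atoms are treated as points (so the 4.0 Å margin is measured from atomic centres rather than from the van der Waals envelope), the population standard deviation is assumed for σ, and the coordinates and field values are invented.

```python
import itertools
import math
import statistics


def build_grid(coords, spacing=2.0, padding=4.0):
    """Return lattice points covering all atoms plus a margin on every side."""
    axes = []
    for dim in range(3):
        lo = min(c[dim] for c in coords) - padding
        hi = max(c[dim] for c in coords) + padding
        n_points = math.ceil((hi - lo) / spacing) + 1  # enough points to reach hi
        axes.append([lo + i * spacing for i in range(n_points)])
    return list(itertools.product(*axes))


def filter_columns(field_matrix, min_sigma=2.0):
    """Return indices of field columns whose standard deviation is >= min_sigma.

    field_matrix has one row per molecule and one column per grid point.
    Assumption: population standard deviation is used for sigma.
    """
    kept = []
    for j in range(len(field_matrix[0])):
        column = [row[j] for row in field_matrix]
        if statistics.pstdev(column) >= min_sigma:
            kept.append(j)
    return kept


# Invented example: two atoms and a three-molecule, three-grid-point field table.
atoms = [(0, 0, 0), (3, 1, 0)]
grid = build_grid(atoms)
fields = [[0.0, 10.0, 1.0], [0.1, -5.0, 1.2], [0.0, 2.0, 0.9]]
print(len(grid), filter_columns(fields))
```

Columns that barely vary across the molecules carry little information for PLS, which is why discarding them speeds up the cross-validated runs.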

Figure 2. Stereoviews of steric (a) and electrostatic (b) CoMFA contour plots for pyrazole 1. (a) Bulky
substituents enhance activity in the regions shaded green and depress activity in the yellow regions. (b)
Electropositive substituents enhance activity in the regions shaded blue and depress activity in the red
regions.

MFA 

The MOPAC-minimized structures were also read into Cerius2 and all the molecules were aligned with
respect to Fragment 1 (Figure 3) [10]. The molecular field was created using a methyl group and a
proton as probes for steric and electrostatic interactions, respectively. Many spatial and
structural descriptors, such as polarizability, dipole moment, radius of gyration, molecular area,
molecular dimensions, density, principal moments of inertia, molecular volume, molecular weight,
number of rotatable bonds, hydrogen bond donors and acceptors, log P, molar refractivity and others,
were also considered along with the field values [11]. Only the 10% of variables with the highest
variance were used as independent variables. The negative logarithm of the biological activity was
chosen as the dependent variable in the generation of QSAR equations using the GFA regression method
(with only linear terms involved in the equations) [12]. All 25 molecules in the training set were
considered as observations. The other specifications are similar to those given in the following section.
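
The two data-preparation steps mentioned above, keeping only the highest-variance 10% of variables and taking the negative logarithm of the activity, can be sketched as follows. This is an illustration under stated assumptions rather than the Cerius2 procedure: `top_variance_columns` and `neg_log_activity` are hypothetical names, the population variance is assumed, and the descriptor table is invented.

```python
import math
import statistics


def top_variance_columns(table, fraction=0.10):
    """Return the names of the highest-variance columns.

    table maps a variable name to its list of values (one per molecule).
    At least one column is always kept.
    """
    n_keep = max(1, round(len(table) * fraction))
    ranked = sorted(table, key=lambda name: statistics.pvariance(table[name]),
                    reverse=True)
    return ranked[:n_keep]


def neg_log_activity(ic50):
    """Dependent variable: -(log10 IC50). The concentration unit is whatever
    unit the IC50 values were reported in."""
    return -math.log10(ic50)


# Invented table of 20 variables whose spread grows with the index.
table = {f"field_{i}": [0.0, float(i), -float(i)] for i in range(20)}
print(top_variance_columns(table))
print(neg_log_activity(0.01))
```

With 20 candidate variables the 10% rule keeps two of them; in a real study the table would hold thousands of grid-point field values alongside the whole-molecule descriptors.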




Figure 3. Molecular alignment (Cerius2) based on Fragment 1. Note the very similar molecular
superpositions in Figures 1 and 3. Still, some subtle differences may be observed.

RSA 

The molecules previously aligned in Cerius2 were reconsidered for the generation of a receptor
surface [13]. The receptor surface was generated with weights based on the biological activity data.
The interaction energies of all the molecules were evaluated within this receptor model. The
receptor surface descriptors, expressed as field values based on methyl group and proton probes,
were added to the study table along with various types of interaction energies for the QSAR study.
Regression analysis was carried out using the GFA method, consisting of over 20,000 generations,
with specific inclusion of constant, linear, spline, quadratic, offset quadratic and quadratic
spline variable terms in a QSAR equation with no fixed length and with no scaling.
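
The term types listed above can be written out explicitly. The sketch below shows only how such a QSAR equation is evaluated once its terms are known; it does not implement the genetic search itself. It assumes the usual truncated-power definition of a spline term, which is zero when its argument is negative, and all names, coefficients and descriptor values are hypothetical.

```python
def linear(x):
    return x


def spline(x, knot):
    """Truncated power spline: zero below the knot, linear above it (assumed form)."""
    return max(0.0, x - knot)


def quadratic(x):
    return x * x


def offset_quadratic(x, offset):
    return (x - offset) ** 2


def quadratic_spline(x, knot):
    return max(0.0, x - knot) ** 2


def evaluate_model(intercept, terms, descriptors):
    """Sum intercept + coefficient * basis(descriptor) over all terms.

    Each term is (coefficient, basis_function, descriptor_name, extra_args).
    """
    total = intercept
    for coefficient, basis, name, extra_args in terms:
        total += coefficient * basis(descriptors[name], *extra_args)
    return total


# Hypothetical two-term equation and descriptor values.
terms = [(2.0, quadratic, "vdW/1", ()), (-1.0, spline, "Ele/2", (0.5,))]
print(evaluate_model(1.0, terms, {"vdW/1": 3.0, "Ele/2": 2.0}))
```

Because the equation has no fixed length, the genetic algorithm is free to add or drop terms of any of these types from one generation to the next.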

Results and Discussion 

CoMFA 

The CoMFA model with 25 compounds produced an r²cv value of 0.659 (the maximum value of r²cv
was obtained with a minimum of 6 components) and a conventional correlation coefficient (r²) of
0.988, with a standard error of estimate of 0.149. The relative contributions of the steric and
electrostatic fields are 0.625 and 0.375, respectively. The real validity of the model lies in
its ability to predict the biological activity of new molecules, in other words, of compounds not
included in the building of the model. A close analysis of the different validity tests indicates
that the model generated by the CoMFA analysis is very good. The actual and predicted activities
for the training set are given in Table 1, and Table 2 contains the same data for the test set
molecules. Table 3 contains additional information regarding model quality.
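
For readers unfamiliar with the cross-validated r², the sketch below shows how it is commonly computed by leave-one-out: each compound is predicted by a model fitted to the remaining ones, and r²cv = 1 - PRESS/SS, where PRESS is the sum of squared prediction errors and SS is the sum of squared deviations of the observed activities from their mean. To keep the example self-contained, a one-descriptor least-squares line stands in for the PLS model used in the study; the function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out cross-validated r2: 1 - PRESS / SS."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Invented descriptor values and activities for five compounds.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [1.1, 2.9, 5.2, 6.8, 9.1]
print(loo_q2(descriptor, activity))
```

Unlike the conventional r², this quantity can fall well below the fitted value (or even below zero) when a model reproduces the training data without being able to predict left-out compounds, which is why both figures are reported.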

Table 1. Structures, and experimental and calculated inhibitory activities, -(log IC50), of the molecules
used in the training set, based on the molecular skeleton defined in Scheme 1.

Comp.  R1, R2               R3     Activity      Calculated activity
No.                                -(log IC50)   CoMFA    MFA      RSA
1      3-CH3-4-SCH3         CF3     2.43          2.23     2.40     2.10
2      3-Cl-4-NHCH3         CF3     1.57          1.60     1.15     0.53
3      3-Cl-4-OCH3-5-CH3    CF3     1.18          1.04     1.14     0.97
4      2,4-di-Cl            CF3     1.25          1.15     1.25     0.64
5      2,4-di-CH3           CF3     0.92          1.07     0.93     0.72
6      2-F                  CF3     1.24          1.22     1.19     1.51
7      4-F                  CF3     1.39          1.15     1.65     1.58
8      2-Cl                 CF3     1.25          1.34     1.25     0.64
9      2-Me                 CF3     1.16          1.20     1.01     1.42
10     3-Me                 CF3     0.96          1.14     1.08     1.41
11     4-CF3                CF3    -0.92         -0.85    -1.12    -0.47
12     2-OMe                CF3     0.54          0.67    -0.60     0.32
13     4-OEt                CF3     0.19          0.27     0.55    -0.42
14     4-SMe                CF3     2.05          2.22     2.02     2.02
15     2-NMe2               CF3    -1.16         -1.34    -0.67    -1.04
16     4-NHMe               CF3     1.80          1.90     0.82     1.52
17     4-CO2H               CF3    -1.05         -1.05    -1.13    -0.87
18     3-C2H5-4-OCH3        CF3     0.37          0.40    -0.07     0.96
19     3,4-di-OCH3          CF3     0.22          0.27     1.65    -0.02
20     4-SO2Me              CF3    -2.00         -2.08    -1.68    -1.18
21     4-CO2H               CHF2   -1.67         -1.48    -1.59    -1.60
22     4-OMe                CHF2    1.82          1.83     1.48     1.76
23     3-Cl-4-OCH3          CHF2    1.57          1.44     1.31     1.55
24     3,5-di-Cl-4-OCH3     CHF2    1.68          1.61     1.43     2.29
25     3,5-di-F-4-OCH3      CHF2    0.46          0.39     0.31     0.08



Table 2. Structures, and experimental and predicted inhibitory activities, -(log IC50), of the molecules
used in the test set.

Comp.  R1, R2               R3     Activity      Predicted activity (a)
No.                                -(log IC50)   CoMFA    MFA      RSA
26     4-Cl                 CF3     2.00          1.04     1.84     1.25
27     4-Me                 CF3     1.40          0.79     1.11     1.24
28     4-NO2                CF3    -0.42         -0.77    -1.72     0.33
29     4-NMe2               CF3     2.33          2.32    -0.70     0.89
30     3-F-4-OCH3           CHF2    1.30          1.40     1.54     1.32

(a) The rms values for the three models are 0.534, 1.486 and 0.803, respectively.

Table 3. Details of CoMFA, MFA, and RSA calculations.

                      CoMFA                    MFA (c)   RSA (d)
r²cv                  0.66                     0.73      0.77
r² (a)                0.99 (0.62 and 0.37)     0.86      0.90
No. of components     6                        4         4
PRESS (b)             0.4                      9.7       7.756
Standard deviation    0.13                     0.43      0.4

(a) Conventional r²; the steric and electrostatic contributions are given in parentheses.

(b) PRESS = predicted sum of squares; it is the root mean square error of all target predictions.

(c) QSAR equation: Activity = 0.947055 - 0.258821(Ele/401) + 0.085612(vdW/392) +
0.122799(Ele/391) - 0.7848(vdW/350).

(d) QSAR equation: Activity = 0.546816 - 624.921(vdW/1726)² + 31.65(vdW/505)² +
602.116(vdW/1563)² - 50.1242(-0.1561 - vdW/511)² - 0.11822(Ele/1673).
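
The linear MFA equation given in the footnotes to Table 3 can be evaluated directly once the four field values are known. The sketch below is only a transcription of that equation into a function; the name `mfa_activity` and the argument names are hypothetical, the coefficients are read from the footnote as printed, and the field values in the example are invented since the paper does not list them.

```python
def mfa_activity(ele_401, vdw_392, ele_391, vdw_350):
    """Predicted -(log IC50) from the four grid-point field values of the MFA model.

    Arguments are the electrostatic (Ele) and steric (vdW) field values at the
    grid points numbered 401, 392, 391 and 350 in the footnote equation.
    """
    return (0.947055
            - 0.258821 * ele_401
            + 0.085612 * vdw_392
            + 0.122799 * ele_391
            - 0.7848 * vdw_350)


# Invented field values for a single molecule.
print(mfa_activity(ele_401=1.0, vdw_392=2.0, ele_391=-1.0, vdw_350=0.5))
```

The signs of the coefficients indicate the direction of each effect: a larger steric field at point 350, for instance, lowers the predicted activity.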

CoMFA coefficient contour maps 

The results of QSAR analysis by CoMFA, with its thousands of terms, are generally represented in
the form of three-dimensional "coefficient contour" maps. The CoMFA steric and electrostatic fields
are represented as coloured contour regions in Figures 2a and 2b, respectively. For reference, molecule
1 is displayed in both maps. The green polyhedra in Figure 2a indicate the regions where more bulky
substituents are expected to increase the activity; in Figure 2b, any electropositive substituents in the
blue regions or electronegative substituents in the red regions enhance the activity. With substituents in
appropriate positions, more than one effect may be anticipated (Table 2).

Figures 2a and 2b show the absence of any CoMFA contouring in the ring A region. This is not
surprising because all the molecules in the training set are identical in this region. In the absence
of any data pertaining to the effect of substitution on ring A, one is unable to say whether or not
substitution on this ring will lead to activity variations. However, when the substitution on ring B is
varied, there is a significant variation in activity. The electrostatic contour (Figure 2b) shows a
favourable interaction of a more electronegative substituent at the 3-position of the ring, and this
could be due to the presence of both CF3 and CHF2 groups among the training set molecules, with the
former generally being the more active.

MFA 

The QSAR model with 25 compounds yielded an r²cv of 0.730 and a conventional correlation
coefficient (r²) of 0.860. The predictive ability of this MFA model was evaluated by predicting the
biological activities of the test set molecules. The predicted and actual activities of the training set
and test set molecules are given in Tables 1 and 2, while Table 3 features some of the data relating to
the validation tests. Figure 4 is the stereoview of the molecules in the training set with a rectangular
field grid. Only those field points involved in the QSAR equation are shown in the figure. It is
noteworthy that a solitary grid point near ring A is included along with the several grid points near
ring B in the QSAR equation.

Figure 4. Stereoview of the molecular rectangular field grid around the superposed molecular units.
Both steric (CH3) and electrostatic (H+) grid points in the final QSAR equation are labelled.

RSA 

The QSAR model generated using the RSA produced an r²cv of 0.77 and a conventional correlation
coefficient (r²) of 0.90. The predictive ability of this model was evaluated by predicting the biological
activities of the training set and test set molecules; the actual and predicted activities are given in
Tables 1 and 2. Figure 5 is a stereoview of the receptor surface, also showing the molecules in the
training set. The violet and green colours indicate favourable and unfavourable interactions,
respectively, between the molecules and the receptor surface. In Figure 5, while the pyrazole ring and
ring A show generally favourable interactions, ring B is not optimized. Thus, substitution patterns may
be changed such that these interactions are also optimized, with a concomitant increase in activity.
All the grid points that are part of the QSAR equation lie in this region. The receptor model also
supports the models generated by the other two methods.




Figure 5. Stereoview of the receptor surface with all molecules considered and weights based on
biological activities. Interaction energies of individual molecules on the receptor surface are coloured
(violet for a favourable interaction and green for an unfavourable one). The grid points involved in the
final QSAR equation are labelled; they originate mainly from the differently substituted phenyl ring,
which is neither violet nor green, indicating that substituents can be varied there without significant
loss of biological activity.

It is of interest to compare the three models (Table 3, Figures 6a and 6b) with respect to the actual
and calculated activities and residuals for all the molecules in the training set. Here, it appears that
the CoMFA model is slightly better than the other two (fewer off-diagonal points in Figure 6a
and bars of smaller height in Figure 6b). When only the test set is considered, the same trend continues,
as seen from the order of the rms values in Table 2. The CoMFA model performs better even when strong
electron-withdrawing and electron-donating groups (NO2, NMe2, F, OMe) are present. The MFA and RSA
models fare better when steric interactions involving Cl and Me substituents are considered. These
trends are reflected in part in the training set. It appears then that each method has advantages and
disadvantages. Studies of the present type should not be confined to one model alone.
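
The rms values quoted under Table 2 can be reproduced from the tabulated test-set activities. The short check below takes the rms error as the square root of the mean squared difference between actual and predicted -(log IC50) over the five test compounds; the function name `rms_error` is our own.

```python
import math

# Test-set data transcribed from Table 2 (compounds 26 to 30).
actual = [2.00, 1.40, -0.42, 2.33, 1.30]
predicted = {
    "CoMFA": [1.04, 0.79, -0.77, 2.32, 1.40],
    "MFA": [1.84, 1.11, -1.72, -0.70, 1.54],
    "RSA": [1.25, 1.24, 0.33, 0.89, 1.32],
}


def rms_error(observed, calculated):
    """Root mean square of the residuals (observed - calculated)."""
    squared = [(o - c) ** 2 for o, c in zip(observed, calculated)]
    return math.sqrt(sum(squared) / len(squared))


for method, values in predicted.items():
    print(method, round(rms_error(actual, values), 3))
# prints: CoMFA 0.534, MFA 1.486, RSA 0.803 (one method per line)
```

Most of the MFA error comes from a single compound, 29 (4-NMe2), whose predicted activity of -0.70 is far from the measured 2.33.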

Figure 6. (a) Plot of actual vs. calculated biological activities of the training set molecules in the three
methods of analysis. (b) Plot of residuals in the three methods of analysis. In general, CoMFA
produced a better model.

Conclusions

Three different analogue-based rational drug design methods have been used in the optimisation of
COX-2 selective inhibitor design. All three methods produce reasonably good models, on the basis of
which the biological activities of new molecules can be predicted.